{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting started with TensorFlow\n",
    "\n",
    "**Learning Objectives**\n",
    "  1. Practice defining and performing basic operations on constant Tensors\n",
    "  1. Use Tensorflow's automatic differentiation capability\n",
    "  1. Learn how to train a linear regression from scratch with TensorFLow\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this notebook, we will start by reviewing the main operations on Tensors in TensorFlow and understand how to manipulate TensorFlow Variables. We explain how these are compatible with python built-in list and numpy arrays. \n",
    "\n",
    "Then we will jump to the problem of training a linear regression from scratch with gradient descent. The first order of business will be to understand how to compute the gradients of a function (the loss here) with respect to some of its arguments (the model weights here). The TensorFlow construct allowing us to do that is `tf.GradientTape`, which we will describe. \n",
    "\n",
    "At last we will create a simple training loop to learn the weights of a 1-dim linear regression using synthetic data generated from a linear model. \n",
    "\n",
    "As a bonus exercise, we will do the same for data generated from a non linear model, forcing us to manual engineer non-linear features to improve our linear model performance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!sudo chown -R jupyter:jupyter /home/jupyter/training-data-analyst"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Ensure the right version of Tensorflow is installed.\n",
    "!pip freeze | grep tensorflow==2.5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from matplotlib import pyplot as plt\n",
    "import tensorflow as tf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Operations on Tensors"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Variables and Constants"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tensors in TensorFlow are either contant (`tf.constant`) or variables (`tf.Variable`).\n",
    "Constant values can not be changed, while variables values can be.\n",
    "\n",
    "The main difference is that instances of `tf.Variable` have methods allowing us to change \n",
    "their values while tensors constructed with `tf.constant` don't have these methods, and\n",
    "therefore their values can not be changed. When you want to change the value of a `tf.Variable`\n",
    "`x` use one of the following method: \n",
    "\n",
    "* `x.assign(new_value)`\n",
    "* `x.assign_add(value_to_be_added)`\n",
    "* `x.assign_sub(value_to_be_subtracted`\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = tf.constant([2, 3, 4])\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = tf.Variable(2.0, dtype=tf.float32, name='my_variable')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Lab Task #1:** Use the `assign(..)` method to assign a new value to the variable `x` you created above. After each, print `x` to verify how the value changes. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 1\n",
    "x.assign( # TODO: Your code goes here. \n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 2\n",
    "x.assign( # TODO: Your code goes here. \n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 3\n",
    "x.assign( # TODO: Your code goes here. \n",
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Point-wise operations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Tensorflow offers similar point-wise tensor operations as numpy does:\n",
    "    \n",
    "* `tf.add` allows to add the components of a tensor \n",
    "* `tf.multiply` allows us to multiply the components of a tensor\n",
    "* `tf.subtract` allow us to substract the components of a tensor\n",
    "* `tf.math.*` contains the usual math operations to be applied on the components of a tensor\n",
    "* and many more...\n",
    "\n",
    "Most of the standard aritmetic operations (`tf.add`, `tf.substrac`, etc.) are overloaded by the usual corresponding arithmetic symbols (`+`, `-`, etc.)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Lab Task #2:** Create two tensorflow constants `a = [5, 3, 8]` and `b = [3, -1, 2]`. Then, compute \n",
    "1. the sum of the constants `a` and `b` below using `tf.add` and `+` and verify both operations produce the same values.\n",
    "2. the product of the constants `a` and `b` below using `tf.multiply` and `*` and verify both operations produce the same values.\n",
    "3. the exponential of the constant `a` using `tf.math.exp`. Note, you'll need to specify the type for this operation.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 1\n",
    "a = # TODO: Your code goes here.\n",
    "b = # TODO: Your code goes here.\n",
    "c = # TODO: Your code goes here.\n",
    "d = # TODO: Your code goes here.\n",
    "\n",
    "print(\"c:\", c)\n",
    "print(\"d:\", d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 2\n",
    "a = # TODO: Your code goes here.\n",
    "b = # TODO: Your code goes here.\n",
    "c = # TODO: Your code goes here.\n",
    "d = # TODO: Your code goes here.\n",
    "\n",
    "print(\"c:\", c)\n",
    "print(\"d:\", d)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 3\n",
    "# tf.math.exp expects floats so we need to explicitly give the type\n",
    "a = # TODO: Your code goes here.\n",
    "b = # TODO: Your code goes here.\n",
    "\n",
    "print(\"b:\", b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### NumPy Interoperability\n",
    "\n",
    "In addition to native TF tensors, tensorflow operations can take native python types and NumPy arrays as operands. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# native python list\n",
    "a_py = [1, 2] \n",
    "b_py = [3, 4] "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Lab Task #3:** Use `tf.add` to compute the sum of the native python arrays `a` and `b`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 1\n",
    "# TODO: Your code goes here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# numpy arrays\n",
    "a_np = np.array([1, 2])\n",
    "b_np = np.array([3, 4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Lab Task #4:** Use `tf.add` to compute the sum of the NumPy arrays `a` and `b`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 1\n",
    "# TODO: Your code goes here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# native TF tensor\n",
    "a_tf = tf.constant([1, 2])\n",
    "b_tf = tf.constant([3, 4])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Lab Task #5:** Use `tf.add` to compute the sum of the Tensorflow constants `a` and `b`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 1\n",
    "# TODO: Your code goes here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can convert a native TF tensor to a NumPy array using .numpy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_tf.numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Linear Regression\n",
    "\n",
    "Now let's use low level tensorflow operations to implement linear regression.\n",
    "\n",
    "Later in the course you'll see abstracted ways to do this using high level TensorFlow."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Toy Dataset\n",
    "\n",
    "We'll model the following function:\n",
    "\n",
    "\\begin{equation}\n",
    "y= 2x + 10\n",
    "\\end{equation}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = tf.constant(range(10), dtype=tf.float32)\n",
    "Y = 2 * X + 10\n",
    "\n",
    "print(\"X:{}\".format(X))\n",
    "print(\"Y:{}\".format(Y))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's also create a test dataset to evaluate our models:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_test = tf.constant(range(10, 20), dtype=tf.float32)\n",
    "Y_test = 2 * X_test + 10\n",
    "\n",
    "print(\"X_test:{}\".format(X_test))\n",
    "print(\"Y_test:{}\".format(Y_test))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Loss Function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The simplest model we can build is a model that for each value of x returns the sample mean of the training set:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y_mean = Y.numpy().mean()\n",
    "\n",
    "\n",
    "def predict_mean(X):\n",
    "    y_hat = [y_mean] * len(X)\n",
    "    return y_hat\n",
    "\n",
    "Y_hat = predict_mean(X_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using mean squared error, our loss is:\n",
    "\\begin{equation}\n",
    "MSE = \\frac{1}{m}\\sum_{i=1}^{m}(\\hat{Y}_i-Y_i)^2\n",
    "\\end{equation}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For this simple model the loss is then:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "errors = (Y_hat - Y)**2\n",
    "loss = tf.reduce_mean(errors)\n",
    "loss.numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This values for the MSE loss above will give us a baseline to compare how a more complex model is doing."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, if $\\hat{Y}$ represents the vector containing our model's predictions when we use a linear regression model\n",
    "\\begin{equation}\n",
    "\\hat{Y} = w_0X + w_1\n",
    "\\end{equation}\n",
    "\n",
    "we can write a loss function taking as arguments the coefficients of the model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def loss_mse(X, Y, w0, w1):\n",
    "    Y_hat = w0 * X + w1\n",
    "    errors = (Y_hat - Y)**2\n",
    "    return tf.reduce_mean(errors)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Gradient Function\n",
    "\n",
    "To use gradient descent we need to take the partial derivatives of the loss function with respect to each of the weights. We could manually compute the derivatives, but with Tensorflow's automatic differentiation capabilities we don't have to!\n",
    "\n",
    "During gradient descent we think of the loss as a function of the parameters $w_0$ and $w_1$. Thus, we want to compute the partial derivative with respect to these variables. \n",
    "\n",
    "For that we need to wrap our loss computation within the context of `tf.GradientTape` instance which will reccord gradient information:\n",
    "\n",
    "```python\n",
    "with tf.GradientTape() as tape:\n",
    "    loss = # computation \n",
    "```\n",
    "\n",
    "This will allow us to later compute the gradients of any tensor computed within the `tf.GradientTape` context with respect to instances of `tf.Variable`:\n",
    "\n",
    "```python\n",
    "gradients = tape.gradient(loss, [w0, w1])\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We illustrate this procedure with by computing the loss gradients with respect to the model weights:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Lab Task #6:** Complete the function below to compute the loss gradients with respect to the model weights `w0` and `w1`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 1\n",
    "def compute_gradients(X, Y, w0, w1):\n",
    "    # TODO: Your code goes here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "w0 = tf.Variable(0.0)\n",
    "w1 = tf.Variable(0.0)\n",
    "\n",
    "dw0, dw1 = compute_gradients(X, Y, w0, w1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"dw0:\", dw0.numpy())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"dw1\", dw1.numpy())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training Loop\n",
    "\n",
    "Here we have a very simple training loop that converges. Note we are ignoring best practices like batching, creating a separate test set, and random weight initialization for the sake of simplicity."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Lab Task #7:** Complete the `for` loop below to train a linear regression. \n",
    "1. Use `compute_gradients` to compute `dw0` and `dw1`.\n",
    "2. Then, re-assign the value of `w0` and `w1` using the `.assign_sub(...)` method with the computed gradient values and the `LEARNING_RATE`.\n",
    "3. Finally, for every 100th step , we'll compute and print the `loss`. Use the `loss_mse` function we created above to compute the `loss`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 1\n",
    "STEPS = 1000\n",
    "LEARNING_RATE = .02\n",
    "MSG = \"STEP {step} - loss: {loss}, w0: {w0}, w1: {w1}\\n\"\n",
    "\n",
    "\n",
    "w0 = tf.Variable(0.0)\n",
    "w1 = tf.Variable(0.0)\n",
    "\n",
    "\n",
    "for step in range(0, STEPS + 1):\n",
    "\n",
    "    dw0, dw1 = #TODO: Your code goes here.\n",
    "    #TODO: Your code goes here.\n",
    "    #TODO: Your code goes here.\n",
    "\n",
    "    if step % 100 == 0:\n",
    "        loss = #TODO: Your code goes here.\n",
    "        print(MSG.format(step=step, loss=loss, w0=w0.numpy(), w1=w1.numpy()))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's compare the test loss for this linear regression to the test loss from the baseline model that outputs always the mean of the training set:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "loss = loss_mse(X_test, Y_test, w0, w1)\n",
    "loss.numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This is indeed much better!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bonus"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try modelling a non-linear function such as: $y=xe^{-x^2}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = tf.constant(np.linspace(0, 2, 1000), dtype=tf.float32)\n",
    "Y = X * tf.exp(-X**2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "\n",
    "plt.plot(X, Y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def make_features(X):\n",
    "    f1 = tf.ones_like(X)  # Bias.\n",
    "    f2 = X\n",
    "    f3 = tf.square(X)\n",
    "    f4 = tf.sqrt(X)\n",
    "    f5 = tf.exp(X)\n",
    "    return tf.stack([f1, f2, f3, f4, f5], axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def predict(X, W):\n",
    "    return tf.squeeze(X @ W, -1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def loss_mse(X, Y, W):\n",
    "    Y_hat = predict(X, W)\n",
    "    errors = (Y_hat - Y)**2\n",
    "    return tf.reduce_mean(errors)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def compute_gradients(X, Y, W):\n",
    "    with tf.GradientTape() as tape:\n",
    "        loss = loss_mse(Xf, Y, W)\n",
    "    return tape.gradient(loss, W)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# TODO 2\n",
    "STEPS = 2000\n",
    "LEARNING_RATE = .02\n",
    "\n",
    "\n",
    "Xf = make_features(X)\n",
    "n_weights = Xf.shape[1]\n",
    "\n",
    "W = tf.Variable(np.zeros((n_weights, 1)), dtype=tf.float32)\n",
    "\n",
    "# For plotting\n",
    "steps, losses = [], []\n",
    "plt.figure()\n",
    "\n",
    "\n",
    "for step in range(1, STEPS + 1):\n",
    "\n",
    "    dW = compute_gradients(X, Y, W)\n",
    "    W.assign_sub(dW * LEARNING_RATE)\n",
    "\n",
    "    if step % 100 == 0:\n",
    "        loss = loss_mse(Xf, Y, W)\n",
    "        steps.append(step)\n",
    "        losses.append(loss)\n",
    "        plt.clf()\n",
    "        plt.plot(steps, losses)\n",
    "\n",
    "\n",
    "print(\"STEP: {} MSE: {}\".format(STEPS, loss_mse(Xf, Y, W)))\n",
    "\n",
    "plt.figure()\n",
    "plt.plot(X, Y, label='actual')\n",
    "plt.plot(X, predict(Xf, W), label='predicted')\n",
    "plt.legend()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright 2019 Google Inc. Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
